ABSTRACT Burkholderia pseudomallei causes the potentially fatal disease melioidosis. It is generally accepted that
B. pseudomallei is a noncommensal bacterium and that any culture-positive clinical specimen denotes disease requiring treatment. Over a
23-year study of melioidosis cases in Darwin, Australia, just one patient from 707 survivors has developed persistent asymptom- 
atic B. pseudomallei carriage. To better understand the mechanisms behind this unique scenario, we performed whole-genome 
analysis of two strains isolated 139 months apart. During this period, B. pseudomallei underwent several adaptive changes. Of 23 
point mutations, 78% were nonsynonymous and 43% were predicted to be deleterious to gene function, demonstrating a strong 
propensity for positive selection. Notably, a nonsense mutation inactivated the universal stress response sigma factor RpoS, with 
pleiotropic implications. The genome underwent substantial reduction, with four deletions in chromosome 2 resulting in the 
loss of 221 genes. The deleted loci included genes involved in secondary metabolism, environmental survival, and pathogenesis. 
Of 14 indels, 11 occurred in coding regions and 9 resulted in frameshift mutations that dramatically affected predicted gene
products. Disproportionately, four indels affected lipopolysaccharide biosynthesis and modification. Finally, we identified a 
frameshift mutation in both P314 isolates within wcbR, an important component of the capsular polysaccharide I locus, suggest- 
ing virulence attenuation early in infection. Our study illustrates a unique clinical case that contrasts a high-consequence infec- 
tious agent with a long-term commensal infection and provides further insights into bacterial evolution within the human host. 

IMPORTANCE Some bacterial pathogens establish long-term infections that are difficult or impossible to eradicate with current 
treatments. Rapid advances in genome sequencing technologies provide a powerful tool for understanding bacterial persistence 
within the human host. Burkholderia pseudomallei is considered a highly pathogenic bacterium because infection is commonly 
fatal. Here, we document within-host evolution of B. pseudomallei in a unique case of human infection with ongoing chronic 
carriage. Genomic comparison of isolates obtained 139 months (11.5 years) apart showed a strong signal of adaptation within 
the human host, including inactivation of virulence and immunogenic factors, and deletion of pathways involved in environmental
survival. Two global regulatory genes were mutated in the 139-month isolate, indicating extensive regulatory changes
favoring bacterial persistence. Our study provides insights into B. pseudomallei pathogenesis and, more broadly, identifies parallel
evolutionary mechanisms that underlie chronic persistence of all bacterial pathogens.
Burkholderia pseudomallei is a Gram-negative bacterium that
causes melioidosis, a potentially fatal infectious disease that is 
frequently associated with at-risk individuals but can also cata- 
strophically afflict healthy people. Melioidosis is contracted by 
percutaneous inoculation, inhalation, ingestion, or aspiration fol- 
lowing environmental exposure to soil, water, or aerosols contain- 
ing B. pseudomallei (1). Although Southeast Asia and northern 
Australia have historically been considered "hot spots" of environmental
B. pseudomallei presence, this organism is becoming
increasingly recognized as an important cause of morbidity and
mortality in other regions (2). Infection with B. pseudomallei can 
be difficult to treat and requires specific and protracted antibiotic 
therapy (1). 

The clinical picture of melioidosis is multifarious, with presen- 
tation and outcome determined by a combination of infecting 
dose, mode of infection, host risk factors, and yet-to-be- 
elucidated bacterial virulence determinants (3). Without a 
prompt diagnosis and access to appropriate antibiotics and state-of-the-art
intensive care facilities, the overall mortality rate is
~40%, increasing to >90% in those presenting with septic shock
(1). An ongoing prospective study of melioidosis cases in Darwin,
Australia (3), has documented 815 culture-confirmed melioidosis 
cases since October 1989, of which 108 were fatal (13%). Approx- 
imately 85% of these patients presented with acute melioidosis 
following a presumed recent infection, 11% displayed chronic ill- 
ness (defined as symptoms lasting >2 months), and the remaining 
4% had delayed disease activation following latency (3). In all but 
one case, melioidosis survivors (n = 707) from the Darwin study 
eventually cleared their initial or relapsed B. pseudomallei infec- 
tions. This single individual, patient 314 (P314), has remained 
chronically colonized with B. pseudomallei since being diagnosed 
with melioidosis in 2000. 
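The cohort figures quoted above can be cross-checked with a few lines of arithmetic. The sketch below uses only the case counts stated in the text; the variable names are illustrative and not taken from the study.

```python
# Case counts as reported in the text for the Darwin prospective study.
total_cases = 815   # culture-confirmed melioidosis cases since October 1989
fatal_cases = 108   # cases with a fatal outcome

# Survivors are the cases that were not fatal.
survivors = total_cases - fatal_cases

# Case fatality as a whole-number percentage of all confirmed cases.
fatality_pct = round(100 * fatal_cases / total_cases)

print(f"survivors: {survivors}")
print(f"case fatality: {fatality_pct}%")
```

The survivor count derived this way agrees with the n = 707 given in the text, and the fatality percentage agrees with the stated 13%.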

Previous longitudinal studies of Burkholderia dolosa, Pseu- 
domonas aeruginosa, and Staphylococcus aureus isolates obtained 
from chronic infections of patients with cystic fibrosis have iden- 
tified convergent patterns of evolution that include decreased vir- 
ulence, increased antimicrobial resistance, and altered metabolic 
fitness due to similar selective pressures (4-8). Additionally, re- 
ductive evolution has been observed in a chronic P. aeruginosa 
infection (5) and is a common trait among obligate pathogens, 
including Burkholderia mallei, Rickettsia spp., Chlamydia spp.,
and Mycobacterium spp. (9-11). These convergent adaptations 
suggest that some bacterial pathogens undergo key genetic 
changes in vivo during the transition to chronic disease, resulting 
in a substantial and sustained niche shift away from being an en- 
vironmental organism or acute pathogen. In the present study, we 
have applied next-generation whole-genome sequencing (WGS) 
technologies to analyze and accurately catalogue genome-wide 
alterations between the initial isolate (MSHR1043), a 37-month 
isolate (MSHR1655) and a 139-month isolate (MSHR6686) from 
P314. Our study of adaptation during chronic persistence sub- 
stantially adds to the current knowledge of bacterial evolution 
within the human host. 

RESULTS AND DISCUSSION 

Assembly and genomic features of the initial P314 isolate, 
MSHR1043. The Illumina-454 hybrid assembly of B. pseudomallei
MSHR1043 (GenBank accession no. AOGU00000000) is in 44
high-quality contigs totaling 7,221,181 bp and has a G+C content
of 68.1%. Alignment of MSHR1043 Illumina reads with
MSHR1043 resulted in >99.9% of the reads mapping to this as- 
sembly. Further contig joining was not possible with these data 
because of several large repetitive regions in B. pseudomallei, in- 
cluding 16S RNA, tRNA, Hep_Hag motifs, and variable-number 
tandem repeat loci. The average size and G+C content of the 
currently six closed B. pseudomallei genomes (1026b, K96243, 
MSHR668, 1710b, BPC006, and 1106a) are 7,178,622 bp and 
68.2%, respectively. The estimated size and G+C content of 
MSHR1043 suggest that this strain has not undergone large ge- 
netic changes in vivo. 

In contrast, 139-month isolate MSHR6686 exhibits four large 
deletions on chromosome 2, which reduce the size of the genome 
by 285 kb to ~6.93 Mbp. Thirty-seven-month isolate MSHR1655
is also missing three of these four deleted regions, totaling ~245 kb
(see Table S1 in the supplemental material). This observation is
consistent with the reductive evolution of bacterial genomes in 
response to a niche shift toward a more restricted and intimate 
long-term association with a eukaryotic host (9-11). This phenomenon
is similar to that observed in B. mallei, a clonal species
within the B. pseudomallei clade that contains a genome smaller 
than its B. pseudomallei ancestor because of the loss of several loci 
following its transition from a soil saprophyte to a restricted, pri- 
marily equine, niche (9). As expected, assembly of unmapped 
reads from MSHR6686 did not yield additional contigs, indicating 
that MSHR6686 has not acquired exogenous DNA. This observa- 
tion is consistent with previous in vivo evolution studies using 
closed bacterial genomes that have identified large-scale deletions 
but not large insertions (5, 12). The lack of MSHR1043 closure 
prevented comprehensive characterization of structural variation; 
however, alignment of MSHR1043 contigs against closed B. pseu- 
domallei genomes failed to identify major deletions or novel in- 
versions, suggesting overall colinearity of the genomes. All other 
mutations (single-nucleotide polymorphisms [SNPs], small 
[<15-bp] indels, and deletions) were readily identified. 

Genome-wide point mutations between the initial and 139- 
month P314 isolates show a strong signal of positive selection. 
Twenty-three SNPs between MSHR1043 and MSHR6686 were 
identified. Nineteen SNPs caused amino acid sequence altera- 
tions, one of which resulted in a nonsense mutation. Ten SNPs 
were predicted to have a disruptive effect on the protein product 
(Table 1). The strikingly high proportion of nonsynonymous 
(NS) mutations in MSHR6686 indicates that the in vivo B. pseu- 
domallei population in P314 is under strong positive selection. 
This phenomenon has been observed in relapse melioidosis (12) 
and in chronic in vivo infections caused by other free-living bac- 
teria such as P. aeruginosa (5) and B. dolosa (4). 
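The proportions behind this argument can be recomputed from the SNP effect column of Table 1. The sketch below is a minimal tally, assuming the category counts as read from that table; the category labels are hypothetical names chosen for illustration.

```python
from collections import Counter

# SNP effect categories tallied from Table 1 (labels are illustrative).
snp_effects = Counter({
    "NS_deleterious": 9,        # nonsynonymous, predicted deleterious
    "NS_neutral": 8,            # nonsynonymous, predicted neutral
    "nonsense_deleterious": 1,  # the RpoS Q82stop mutation
    "synonymous": 2,
    "intergenic": 3,            # rows listed as NA (no affected gene)
})

total_snps = sum(snp_effects.values())

# Protein-altering SNPs: nonsynonymous plus nonsense.
protein_altering = (snp_effects["NS_deleterious"]
                    + snp_effects["NS_neutral"]
                    + snp_effects["nonsense_deleterious"])

# SNPs predicted to disrupt the protein product.
deleterious = (snp_effects["NS_deleterious"]
               + snp_effects["nonsense_deleterious"])

pct_protein_altering = round(100 * protein_altering / total_snps)
pct_deleterious = round(100 * deleterious / total_snps)

print(total_snps, pct_protein_altering, pct_deleterious)
```

Under this tally the percentages match the 78% nonsynonymous and 43% deleterious figures given in the abstract.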

The single nonsense mutation in MSHR6686 occurred in 
D512_11298, resulting in truncation of the sigma factor protein 
RpoS (Q82stop). Sigma factors actively recruit RNA polymerase 
for upregulated transcription in response to certain environmen- 
tal cues. RpoS, the universal stress response sigma factor, upregu- 
lates a large array of genes in response to environmental stimuli, 
including low pH, oxidative stress, extreme temperatures, and 
carbon starvation (13, 14). This single mutation therefore has the 
pleiotropic potential to affect a large portion of the organism's 
metabolism. Although the loss of rpoS would be highly disadvan- 
tageous for an organism facing a complex natural environment, it 
potentially removes a metabolic cost and thus can confer selective 
advantages in the host environment. Identified rpoS mutants in 
other bacterial species have several selective advantages, including 
improved nutrient assimilation under nutrient-limiting condi- 
tions (15), abolished transcription of certain genes in stationary 
phase and superinduction of other stationary-phase-induced 
genes required for host survival (16), increased antibiotic resis- 
tance and persistence (17), and biofilm formation (18). Therefore, 
rpoS abolition in MSHR6686 probably has relatively minor nega- 
tive consequences in vivo. 

MSHR6686 also has a predicted deleterious NS mutation
(A660T) in an RpoD protein (encoded by D512_27123). Although
RpoD is essential for cellular function, MSHR1043 has genes that
encode seven distinct RpoD proteins (D512_07071, D512_11543,
D512_10313, D512_09748, D512_30918, D512_27123, and
D512_30563). This wide array of RpoD-encoding loci in B. pseudomallei
may enable the bacterium to respond and adapt to
changing ecological niches, ranging from environmental survival
to eukaryotic (e.g., nematode, plant, mammalian) infection. A
null mutation in one RpoD locus is unlikely to be lethal, although
it is expected to adversely affect certain regulatory pathways. The
rpoD mutation, but not the rpoS mutation, is also present in
B. pseudomallei MSHR1655 (a previously sequenced strain from
P314 isolated 37 months after MSHR1043; GenBank accession no.
AAHR00000000), indicating that the rpoD mutation is stable in
vivo. How these mutations affect global transcription profiles in
P314 remains unknown and is worthy of further study.

TABLE 1 SNP differences between the initial (MSHR1043) and 139-month (MSHR6686) B. pseudomallei isolates from P314

| Location in MSHR1043 | Nucleotide change | Amino acid change | Affected gene in MSHR1043 (K96243) | SNP effect^a | Mutated in MSHR1655? | Affected protein(s) | Putative protein function(s) |
| seq0001 396179 | C→T | T194M | D512_01885 (BPSL0128) | NS; deleterious | No | NtrX | DNA transcription regulator |
| seq0001 684105 | G→T | A101A | D512_03260 (BPSL0430) | S; neutral | No | Thioesterase | Unknown |
| seq0001 798196 | C→T | A5T | D512_03780 (BPSL0530) | NS; neutral | No | HPr kinase, phosphorylase | Cell adhesion, virulence |
| seq0003 154078 | A→G | H115R | D512_07998 (BPSL1362) | NS; neutral | Yes | PstB | Phosphate uptake |
| seq0003 331699 | T→C | I33V | D512_08718 (BPSL1981) | NS; neutral | Yes | GalU | Sugar, LPS metabolism |
| seq0003 397164 | A→C | L47R | D512_09003 (BPSL1924) | NS; deleterious | No | Hypothetical protein | Unknown |
| seq0003 536977 | G→A | NA^b | NA | NA | No | 99 bp upstream of TetR | Unknown |
| seq0003 678249 | G→T | R2242L | D512_10148 (BPSL1712) | NS; neutral | No | Syringomycin synthetase | Phytopathogenesis |
| seq0003 719972 | T→C | F84L | D512_10298 (BPSL1689) | NS; deleterious | No | Hypothetical protein | Unknown |
| seq0003 720544 | G→A | G85S | D512_10303 (BPSL1688) | NS; deleterious | Yes | Hypothetical protein | Unknown |
| seq0003 943033 | G→A | Q82stop | D512_11298 (BPSL1505) | Nonsense; deleterious | No | RpoS | Universal stress response |
| seq0003 1040609 | G→A | G263D | D512_11778 (BPSL1413) | NS; deleterious | Yes | Glucose-6-phosphate isomerase | Glycolysis, gluconeogenesis |
| seq0006 91557 | T→C | L279P | D512_14441 (BPSL2409) | NS; deleterious | Yes | ATP binding protein | Molecular transport |
| seq0007 422966 | G→A | NA | NA | NA | No | NA | Unknown |
| seq0008 15374 | G→C | I637M | D512_17625 (BPSL3016) | NS; deleterious | Yes | SecA | Membrane transport; secretion |
| seq0017 181448 | C→T | S97F (S72F)^c | D512_21859 (BPSS0946) | NS; neutral | Yes | Class A β-lactamase PenA | β-Lactam resistance |
| seq0020 32593 | C→A | F99L | D512_23016 (BPSS1260) | NS; neutral | No | Hypothetical protein | Unknown |
| seq0028 205054 | G→T | R458L | D512_25718 (BPSS1501) | NS; neutral | No^d | ClpB | Heat shock |
| seq0028 229189 | T→C | H322R | D512_25813 (BPSS1520) | NS; neutral | Yes^e | AraC regulator | DNA regulation, virulence |
| seq0028 480279 | T→C | D54G | D512_26783 (BPSS1698) | NS; deleterious | Yes | RsrI | DNA methylation |
| seq0028 553993 | C→T | A660T | D512_27123 (BPSS1755) | NS; deleterious | Yes | RpoD | Primary sigma factor |
| seq0028 1047356 | G→A | NA | NA | NA | No | NA | Unknown |
| seq0028 1280562 | G→A | S2S | D512_30428 (BPSS2321) | S; neutral | Yes | Hypothetical protein | Unknown |

^a According to SnpEff (66) and PROVEAN (67). S, synonymous; NS, nonsynonymous.
^b NA, not applicable.
^c Resides within the previously described, conserved PenA 70SXXK73 motif (68). S72F causes ~4-fold increased resistance to amoxicillin-clavulanic acid but not to other β-lactams (20, 69).
^d MSHR1655 possesses a different mutation in this gene (insertion of a repetitive NAP motif at the 3' end).
^e MSHR1043 possesses the mutant allele; MSHR1655 and MSHR6686 possess the wild-type allele.

The molecular mechanisms underpinning B. pseudomallei resistance
to clinically administered β-lactam antibiotics are well
documented (19-23). An NS SNP (S72F) in the class A
β-lactamase gene penA (Table 1) leads to 8-fold increased resistance
to amoxicillin-clavulanate in MSHR6686 (24 μg/ml) but no
increased resistance to other β-lactams, including ceftazidime
(20, 23). This NS SNP is also present in MSHR1655. Both
amoxicillin-clavulanate and ceftazidime, which are commonly
used to treat B. pseudomallei infections, have been administered
on multiple occasions over the period of P314's infection. How- 
ever, unlike ceftazidime, amoxicillin-clavulanate has only been 
administered to treat other organisms complicating P314's bron- 
chiectasis. Therefore, the PenA S72F mutant has arisen and per- 
sisted in the B. pseudomallei population as a consequence of on- 
going selective pressures incurred by semiregular amoxicillin- 
clavulanate treatment for other pathogens. Amoxicillin- 
clavulanate-resistant B. pseudomallei may potentiate cross- 
resistance in other pathogens via either recombination of mutant 
penA into other species or as a side effect of localized enzymatic 
degradation. The reduced effectiveness of amoxicillin-clavulanate 
against other pathogens in the airways of P314 remains to be ex- 
plored. 

Many of the remaining NS mutations in MSHR6686 occurred 
in genes whose functions are not well characterized in B. pseu- 
domallei. Four (21%) NS point mutations were identified in hy- 
pothetical proteins (encoded by D512_09003, D512_10298, 
D512_10303, and D512_23016), three of which were predicted to 
be detrimental to protein function (Table 1). Other loss-of-function
NS SNPs were identified in poorly characterized genes
(D512_01885, D512_11778, D512_14441, D512_17625, and
D512_26783). Although the functional role of these genes remains 
enigmatic, our study demonstrates that they are all dispensable in 
chronic infection and their loss may even be beneficial, either 
because these loci elicit a strong immunological response or oth- 
erwise have evolutionary costs that select for their inactivation. 

Genome-wide indels between P314 isolates demonstrate 
strong selective pressure toward loss-of-function mutations. A 
previous study of relapse B. pseudomallei isolate pairs found that 
only 40% of the indels found occurred in putative coding se- 
quences (12). In contrast, 11/14 (79%) small (<15-bp) indels (Ta- 
ble 2) identified between MSHR1043 and MSHR6686 occurred in 
putative coding regions. Seven of these coding sequence indels are 
also present in MSHR1655. Interestingly, the three intergenic in- 
dels occurred exclusively in MSHR1043 and were not observed in 
MSHR6686, MSHR1655, or closed B. pseudomallei genomes. 
Nine indels in MSHR6686 caused frameshift mutations that al- 
tered peptide length, whereas two were in-frame indels (Table 2). 
The in-frame indels were predicted to have a neutral impact on 
protein function, in contrast to the frameshift mutations, which 
were all detrimental. The overrepresentation of indels in coding 
regions provides further evidence for positive selection in P314. 
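The split between frameshift and in-frame indels follows from indel length: a change in length that is not a multiple of three shifts the reading frame. The sketch below applies that rule to the reference and alternate alleles listed for the 11 coding-region indels in Table 2; the function and variable names are illustrative, and the rule ignores special cases such as indels that span a start or stop codon.

```python
# (reference allele, alternate allele) for the 11 coding-region indels,
# as listed in the "Nucleotide change" column of Table 2.
coding_indels = [
    ("CCGCGACGCCGAG", "C"),  # D512_01975
    ("GGC", "G"),            # D512_06755
    ("CT", "C"),             # D512_07898
    ("G", "GGA"),            # D512_08953 (oacA)
    ("AC", "A"),             # amrA (BPSL1804)
    ("CGCCGGGGCGG", "C"),    # D512_10308
    ("GC", "G"),             # D512_11953
    ("C", "CCACTCG"),        # D512_15706 (ureE)
    ("T", "TC"),             # D512_15766 (wbiI)
    ("G", "GC"),             # D512_15771 (wbiH)
    ("A", "AC"),             # D512_25678 (virG)
]


def is_frameshift(ref: str, alt: str) -> bool:
    """Return True when the net length change is not a multiple of 3."""
    return (len(alt) - len(ref)) % 3 != 0


frameshifts = sum(is_frameshift(ref, alt) for ref, alt in coding_indels)
in_frame = len(coding_indels) - frameshifts

print(f"frameshift: {frameshifts}, in-frame: {in_frame}")
```

Applied to these alleles, the length rule reproduces the nine frameshift and two in-frame indels described in the text.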

MSHR6686 has accumulated four indels affecting lipopolysac- 
charide (LPS) biosynthesis and modification loci. LPS is an inte- 
gral component of the outer membrane of Gram-negative bacte- 
ria and is a known virulence factor in B. pseudomallei that confers 
resistance to bactericidal compounds in human serum (24). LPS is 
also critical for the successful establishment of acute B. pseudomal- 
lei infection, enabling the bacterium to escape macrophage killing 
in vitro (25). LPS is composed of three components: the 
membrane-associated lipid A, the oligosaccharide-rich core, and 
the immunogenic outer membrane O antigen. Although LPS is an 
important virulence determinant, the potent immunogenicity of 
LPS, particularly its O antigens, appears disadvantageous for long- 
term persistence. Loss of function of O-antigen biosynthesis and 
modification is a hallmark of chronic P. aeruginosa infection (5) 
and may indicate an important strategy employed by B. pseu- 
domallei to evade the immune response. 

Indel inactivation of the LPS-associated genes wbiI
(D512_15766) and oacA (D512_08953) has been reported in P314
isolates (26, 27), and these indels are maintained in MSHR6686. A
1-bp insertion in the highly conserved wbiI gene causes loss of
function of an O-antigen biosynthesis pathway, leading to the 
characteristic "rough" LPS phenotype (caused by modification, 
reduction, or absence of O-antigen chains) and susceptibility to 
human serum (26). A coexisting 2-bp insertion in oacA results in 
an LPS that putatively reacts to B. mallei LPS-specific monoclonal 
antibody (26). B. mallei lacks OacA (26), which is involved in the 
modification of O-antigen L-6dTalp residues by acetylation at 
their O-4 position (26, 27) and may also methylate the O-2 position
(26). Inactivation of OacA in MSHR6686 demonstrates a
convergent evolutionary strategy with the host-adapted pathogen 
B. mallei. 

We identified two additional genes involved in LPS biosynthe- 
sis and modification that appear to be affected by indels in 
MSHR6686. A 1-bp insertion in the highly conserved wbiH 
(D512_15771) gene, situated immediately downstream of wbiI,
was identified in MSHR6686 but not in MSHR1655 (Table 2). The 
wbiH mutation potentially results in defunct initiation of 
O-antigen subunit assembly (28). A 2-bp deletion in D512_06755 
(encodes LPS heptosyltransferase I, involved in LPS inner core 
biosynthesis [29]) caused a frameshift in this protein, resulting in
loss of function. Interestingly, the wbiH and D512_06755 indels 
are not present in MSHR1655, although MSHR1655 possesses a 
novel indel in D512_06755 that results in the insertion of a repet- 
itive motif (AHL) at codon 278. This indel is also predicted to 
abolish the function of LPS heptosyltransferase I (Table 2), dem- 
onstrating that this locus is under heavy selective pressure to ac- 
quire loss-of-function mutations. The continued degradation of 
LPS biosynthesis and modification pathways indicates that the 
wbiI and oacA mutants are insufficient for complete abolition of
O-antigen immunogenicity, leading to continued downregula- 
tion by the bacterium for immune evasion. Alternatively, LPS- 
associated loci may no longer be under selection in the host envi- 
ronment and are therefore subject to genetic drift. Partial deletion 
of an O-antigen acetylase wbiA homologue, D512_20407, was also 
observed in MSHR6686. Although WbiA is required for 2-O-acetylation
in B. pseudomallei (27), it is unclear what functional
effect the partial deletion of the ostensibly redundant D512_20407 
wbiA locus has, if any, on LPS function. Collectively, our study 
consolidates previous findings that LPS is dispensable in chronic 
bacterial infections (4, 5). 

A 1-bp indel in virG (D512_25678 [or BPSS1494 in K96243]),
observed in both MSHR1655 and MSHR6686, resulted in a frame- 
shift that increased protein length from 245 to 854 amino acids 
(Table 2). VirG is part of a two-component sensor-regulator sys- 
tem for type 6 secretion system cluster 1 (T6SS1) and is essential 
for the expression of this cluster (30). B. pseudomallei produces six
T6SS clusters, although only T6SS1 is necessary for virulence in 
the hamster and murine models (30, 31). The T6SS1 apparatus is 
structurally similar to bacteriophage, injecting bacterial effector 
molecules into the host cell cytosol in a contact-dependent manner
(32). On the basis of these prior studies, it is likely that the virG
mutation has resulted in the loss of expression of T6SS1, an im- 
portant virulence factor in B. pseudomallei. 

A loss-of-function indel in the periplasmic multidrug efflux
lipoprotein AmrA (encoded by BPSL1804 in K96243) was seen in
MSHR1043 but not MSHR6686 or MSHR1655. AmrA is part of
the aminoglycoside and macrolide resistance operon AmrAB-OprA,
which causes the efflux of antibiotics such as gentamicin
and erythromycin from the B. pseudomallei cytoplasm (33). Concomitant
with defective AmrAB-OprA in MSHR1043, the gentamicin
MIC is just 1 μg/ml, whereas the MIC for MSHR6686 is
12 μg/ml. In addition, MSHR1043 fails to grow on Ashdown's
agar, a selective medium for B. pseudomallei that contains 4 μg/ml
gentamicin (34). A neutral NS SNP at D512_25813 was also identified
in MSHR1043 that was not found in MSHR1655,
MSHR6686 (Table 1), or any other publicly available B. pseudomallei
genome. The exclusivity of these mutations in
MSHR1043 suggests that this strain resides within a lineage distinct
from that of MSHR1655 or MSHR6686 or, alternatively, that
the MSHR1043 lineage has since become extinct. The coexistence
of multiple lineages in vivo following a clonal inoculation event
has been documented in isolates from P314 (35), other patients
infected with B. pseudomallei (12, 36), and other species (5). WGS
of additional midpoint isolates is needed to confirm the evolutionary
scenario.

TABLE 2 Indel differences between the initial (MSHR1043) and 139-month (MSHR6686) B. pseudomallei isolates from P314

| Location in MSHR1043 | Nucleotide change | Affected gene in MSHR1043 (K96243)^a | Indel effect(s)^b | Mutated in MSHR1655? | Affected protein | Putative function(s) of protein |
| seq0001 105149 | CT→C | NA^c | NA | No^d | NA | Unknown |
| seq0001 416897 | CCGCGACGCCGAG→C | D512_01975 (BPSL0194) | Deletion (LGVA), codon 20; neutral | No | Hypothetical protein | Unknown |
| seq0001 1496938 | GGC→G | D512_06755 (BPSL1120) | Frameshift, codon 64; premature stop (381→103 aa); deleterious | No^e | LPS heptosyltransferase I | Biosynthesis, modification of LPS core |
| seq0003 131174 | CT→C | D512_07898 (BPSL1343) | Frameshift, codon 25; premature stop (270→110 aa); deleterious | Yes | Hypothetical protein | Unknown |
| seq0003 388954 | G→GGA | D512_08953 (BPSL1936) | Frameshift, codon 100; premature stop (394→110 aa); deleterious | Yes | OacA | Biosynthesis, modification of LPS O antigen |
| seq0003 537757 | AC→A | Not assigned (BPSL1804) | Frameshift, codon 247 in MSHR1043; increased length (399→419 aa); deleterious | No^d | AmrA | Antibiotic resistance |
| seq0003 721338 | CGCCGGGGCGG→C | D512_10308 (BPSL1687) | Frameshift, codon 85; premature stop (213→179 aa); deleterious | Yes | Hypothetical protein | Unknown |
| seq0003 1076302 | GC→G | D512_11953 (BPSL2093) | Frameshift, codon 43; premature stop (96→73 aa); deleterious | Yes | Hypothetical protein | Unknown |
| seq0007 181244 | C→CCACTCG | D512_15706 (BPSL2660) | Insertion (HS), codon 174 (205→207 aa); neutral | Yes | UreE | Nitrogen fixation |
| seq0007 194989 | T→TC | D512_15766 (BPSL2672) | Frameshift, codon 272; premature stop (637→550 aa); deleterious | Yes | WbiI | Biosynthesis, modification of LPS O antigen |
| seq0007 196460 | G→GC | D512_15771 (BPSL2673) | Frameshift, codon 121; premature stop (336→141 aa); deleterious | No | WbiH | Biosynthesis, modification of LPS O antigen |
| seq0020 30723 | ACGCAT→A | NA | NA | No^d | NA | Unknown |
| seq0024 104413 | GCATCGA→G | NA | NA | No^d | NA | Unknown |
| seq0028 195058 | A→AC | D512_25678 (BPSS1494) | Frameshift, codon 183; increased length (245→854 aa); deleterious | Yes | DNA-binding response regulator | DNA transcription regulation, virulence |

^a A 2-bp insertion at position 298 of oacA (D512_08953) and a 1-bp insertion at position 815 of wbiI (D512_15766) have been previously identified (25, 26).
^b According to SnpEff (66) and PROVEAN (67). aa, amino acids.
^c NA, not applicable.
^d Indel identified in MSHR1043; neither any other previously sequenced B. pseudomallei strains nor the two latter P314 isolates, MSHR1655 and MSHR6686, share these indels.
^e MSHR1655 has an insertion of a repetitive AHL amino acid motif at codon 278 (total length, 384 amino acids).

Large deletions have occurred in the 139-month isolate, 
MSHR6686. Large genomic deletions are the most visible result of 
the ongoing process of genome decay that occurs as bacteria adapt 
to a more restricted niche (5, 9, 12). Reductive evolution involves 
the loss of nonessential bacterial genes in the host environment, 
including analogous gene products manufactured by the host that 
can be exploited by the bacterium (37). Consistent with irreversible
host adaptation in P314, MSHR6686 has cumulatively lost
285 kb at four distinct loci. The deletions in MSHR6686 have
reduced its genome size by 4%, to 6.93 Mbp, resulting in the loss of
221 and 195 genes relative to MSHR1043 and K96243, respectively
(see Table S1 in the supplemental material). All of the deletions
occur exclusively on chromosome 2, which contains a greater proportion
of gene clusters involved in nonessential functions such as
secondary metabolism, environmental survival, and pathogenesis
than does chromosome 1 (38). Several amino acid metabolism 
pathways have been deleted from MSHR6686 (see Table S1), indicating
that analogous products exist in the host and are being
utilized by the bacterium in vivo or that these biosynthesis path- 
ways are redundant in B. pseudomallei. For example, amino acid 
biosynthesis genes are frequently decayed or lost in obligate intracellular
human pathogens such as the Rickettsia spp., Chlamydia
spp., and Mycobacterium spp. The loss of these genes therefore
probably provides a metabolic advantage in the P314 B. pseu- 
domallei population, enabling a larger proportion of its genome to 
be committed to central functions such as replication, transcrip- 
tion, and translation (10). 
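The size of the reduction described above can be checked against the draft assembly length reported earlier for MSHR1043. This is a rough sketch only: it treats "285 kb" as exactly 285,000 bp and uses the 44-contig assembly total as the starting genome size, both of which are approximations, and the variable names are illustrative.

```python
# Draft assembly length of the initial isolate MSHR1043, from the text.
assembly_bp = 7_221_181

# Cumulative size of the four chromosome 2 deletions, taken as 285 kb.
deleted_bp = 285_000

# Estimated genome size of MSHR6686 after the deletions.
reduced_bp = assembly_bp - deleted_bp

# Fraction of the genome lost, as a percentage.
reduction_pct = 100 * deleted_bp / assembly_bp

print(f"estimated size after deletions: {reduced_bp:,} bp")
print(f"genome reduction: {reduction_pct:.1f}%")
```

With these inputs the estimate lands at roughly 6.9 Mbp and a reduction of about 4%, in line with the figures stated in the text.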

Reductive evolution is also starkly evident in B. mallei, which 
has a genome that is ~1.5 Mbp smaller than that of its free-living
ancestor B. pseudomallei. Interestingly, there is substantial parallel 
gene loss between MSHR6686 and B. mallei; ~141 kb (49.5%) of
the loci absent from MSHR6686 are also absent from B. mallei,
spanning (in K96243) BPSS1096 to BPSS1112 and BPSS1123 to
BPSS1203 (see Table S1). These deleted loci encode mainly putative
secondary metabolic pathways that facilitate bacterial responses
to different environmental conditions but not those encountered
in the mammalian host. The absence of these loci in
both B. mallei and the P314-adapted B. pseudomallei strain pro- 
vides further insights into the divergence of B. mallei from its 
B. pseudomallei ancestor (9). 

MSHR6686 has also shed genes encoding putative lipoprotein 
capsule biosynthesis and modification, chemotaxis, motility and 
quorum sensing, fatty acid biosynthesis, DNA methylation, thiamine
biosynthesis, antibiotic resistance, and virulence factor biosynthesis
(see Table S1 in the supplemental material). Many of
these loci are not well characterized in B. pseudomallei, and several 
do not yet have known functions. However, two key loci deleted 
from MSHR6686 have previously been functionally characterized. 
The first is an efflux pump encoded by the BpeEF-OprC operon 
(D512_20002 to D512_20022) that causes the efflux of chloramphenicol
and trimethoprim (39); the MSHR6686 mutation has
likely rendered this isolate susceptible to these compounds. 
Trimethoprim-sulfamethoxazole is an antibiotic frequently used 
in melioidosis treatment, although it has not been routinely ad- 
ministered to P314 because of the patient's intolerance to this 
drug. Other putative antibiotic resistance genes, including the 
BpeGH-OprD operon (D512_22681 to D512_22696) (40) and a 
metallo-β-lactamase gene, D512_31289, were also deleted, although 
their substrates and functions remain elusive. Efflux pump 
mutations have also been observed in a long-term infection with 
P. aeruginosa, which, like B. pseudomallei, is naturally resistant to 
many antibiotics (5). Characterization of these antibiotic- 
resistant loci may provide novel therapeutic options for P314. 

The second previously characterized locus abolished in 
MSHR6686 is the type III secretion 3 (Bsa TTSS3) virulence clus- 
ter. TTSS apparatus are common in Gram-negative pathogens 
and have phenotypic features in common with bacterial flagella. 
TTSSs allow pathogens to directly inject eukaryotic host cells with 
virulence proteins that are functionally similar to host proteins yet 
subvert the host cell solely for the benefit of the bacterium (41). 
B. pseudomallei produces three TTSS apparatus. Unlike TTSS1 
and TTSS2, Bsa TTSS3 is essential for virulence in the hamster 
model (42), facilitating invasion of nonphagocytic cells and subsequent 
vacuolar escape (43). Although many of the genes encoding 
TTSS3 effector molecules (e.g., bopA, bopE, bapA, and bapC) 
are not mutated in MSHR6686, structural components that form 
the critical "injectisome" of Bsa TTSS3 are deleted (see Table S1 in 
the supplemental material). Previous work has shown that the 
deletion of individual TTSS3 effector molecules may not reduce 
virulence; however, deletion of the highly conserved gene sctU3 
(spaS; D512_25888), which encodes a protein essential to the 
TTSS3 secretory pathway and which is also deleted from 
MSHR6686, significantly attenuates virulence (42). TTSS appara- 
tus show signs of convergent evolution, being downregulated in 
vivo in other pathogenic species (5, 44); a notable exception is 
B. mallei, which has a fully functional Bsa TTSS3 locus that contributes 
to its virulence (45). Collectively, our results suggest that 
TTSS3 is important during initial infection and maintenance of 
virulence capacity but is disadvantageous for long-term commen- 
sal B. pseudomallei infection. 

Large deletions have been reported in free-living bacterial spe- 
cies isolated from chronic infections. Comparison of P. aeruginosa 
isolates obtained 96 months apart from a cystic fibrosis patient 
revealed a large 188-kb deletion in the latter isolate, with a concomitant 
loss of ~139 genes (5). Similarly, a study of initial and 
relapse melioidosis isolate pairs revealed a 330-kb deletion be- 
tween two B. pseudomallei isolates (1258a and 1258b) obtained 
6 months apart (12). Surprisingly, this deletion affected genes 
BPSS1249 to BPSS1483 (232 genes, 10% of the replicon) located 
on chromosome 2 yet did not overlap any deleted loci in 
MSHR6686 (see Table S1 in the supplemental material). There 
was also no overlap in SNPs or indels, with the exception of 
BPSS0291 (D512_20007 in MSHR1043), which encodes a lipase-like 
protein that was deleted from MSHR6686 and contained a 
frameshift in 354e, a lung isolate obtained 75 months after the 
original isolate (354a) (12). In general, there was a better correla- 
tion between mutations observed in MSHR6686 and those from 
chronic infections with other pathogenic species (e.g., P. aerugi- 
nosa [5] and Burkholderia cenocepacia [44]) than from relapsed 
B. pseudomallei infections (12). Relapse melioidosis usually results 
from inadequate duration of or poor patient compliance with the 
prescribed therapy, and is defined as a recurrent clinical illness 
after the date for completion of the prescribed treatment for the 
initial infection (3, 46, 47). Thus, intact virulence capabilities are a 
necessity. The basis for these two different disease pathologies 
suggests fundamental selection differences and evolutionary tra- 
jectories. 

Altered growth rate and morphological differences between 
MSHR1043 and MSHR6686. We observed that MSHR6686 takes 
approximately 1 to 2 days longer to grow than MSHR1043 when 
plated on both rich and selective media, including chocolate agar, 
Luria-Bertani agar, and Ashdown's agar (without gentamicin) 
(Fig. 1). Growth rate and morphology differences have also been 
observed in other chronically infecting pathogens, including 
S. aureus and P. aeruginosa (7, 48). Intriguingly, B. mallei can be 
differentiated from wild-type B. pseudomallei on the basis of a 
slower growth rate on both minimal and rich media (49). These 
parallels suggest that the large deletions observed in MSHR6686, 
rather than SNPs or indels accumulated in this strain, are respon- 
sible for slower growth. 

FIG 1 Growth rate and morphology differences between P314 B. pseudomallei 
isolates. The initial isolate, MSHR1043, grows well after 48 and 72 h of 
incubation on selective Ashdown's agar (without gentamicin) (top and bottom 
left, respectively). In contrast, the 139-month isolate, MSHR6686, exhibits 
a substantially slower growth rate and morphological differences that make 
it difficult to recognize as B. pseudomallei (48 and 72 h of growth, top and 
bottom right, respectively). The growth rate difference is likely due to 
considerable genetic loss affecting 221 genes on chromosome 2. 

Cause of avirulence in the initial P314 strain, MSHR1043. 
MSHR1043 is avirulent in the intraperitoneal murine model, even 
at a high infecting dose (A. Tuanyok, unpublished data). We were 
therefore interested in identifying the potential molecular basis of 
virulence attenuation in MSHR1043. The Prokaryotic Genome 
Automatic Annotation Pipeline used for MSHR1043 annotation 
identified 13 frameshift indels in this strain, of which 4 also occurred 
in MSHR6686 and MSHR1655 (Table 3). Only one of these 
indels causes loss-of-function of an essential virulence factor, 
WcbR, a fatty acid synthase within the 35-kb capsular polysaccha- 
ride I (CPS-I) cluster (50). Intact CPS-I prevents phagocytosis by 
host immune cells (51) and is functional in B. mallei (52). In vitro 
knockouts of wcbR have a demonstrated reduction in CPS-I pro- 
duction and expression (50). The 1-bp indel in wcbR has persisted 
throughout P314's infection (see Figure S1 in the supplemental 
material), indicating that it occurred prior to MSHR1043 isola- 
tion and has favorable consequences in vivo. We speculate that this 
CPS-I mutation was responsible for virulence attenuation early in 
P314's infection and was a critical step in the progression to 
chronic-carriage disease. 

In conclusion, we have characterized genome-wide changes 
occurring in B. pseudomallei during an ongoing chronic-carriage 
infection of the human respiratory tract. The mutational pattern 
demonstrates a predominantly positive selection signal that conveys 
benefits for host adaptation and commensalism. The slow 
growth rate of chronically infecting bacteria, their exquisite ability 
to evade killing by antibiotics and the immune system, and the 
inherent heterogeneity of chronic in vivo populations make erad- 
ication on the basis of current treatments difficult, if not impossi- 
ble. P314's infection represents a hitherto undescribed B. pseu- 
domallei scenario in the global melioidosis literature that we have 
termed "chronic carriage" to denote a long-term, persistent hu- 
man infection that has become, paradoxically, asymptomatic. Al- 
though asymptomatic carriage of B. pseudomallei has been sus- 
pected in goats (53), it has been conventionally believed that the 
organism is not carried asymptomatically by humans and that any 
culture-positive clinical specimen necessarily represents disease 
requiring therapy (54). Protracted carriage of B. pseudomallei has 
also been observed in some cystic fibrosis patients (55, 56), sug- 
gesting that B. pseudomallei commensalism may be more com- 
mon than previously thought. 

MATERIALS AND METHODS 

Ethics statement. This study has been approved by the Human Research 
Ethics Committee of the Northern Territory Department of Health and 
Menzies School of Health Research, approval number HREC 02/38 (Clin- 
ical and Epidemiological Features of Melioidosis). Written informed con- 
sent was provided by the study participant. 

Clinical history of P314 and bacterial isolates used in this study. 
P314 is a white female with bilateral non-cystic fibrosis bronchiectasis but 
no other melioidosis risk factors who originally presented with a chronic 
cough productive of green sputum in July 2000 when 61 years old. P314 
had previously been treated for Mycobacterium avium complex and P. 
aeruginosa pulmonary infections and had prior surgery for chronic sinus- 
itis. A chest X-ray showed patchy pulmonary infiltrates associated with 
bilateral basal bronchiectasis, and a computed tomography scan showed 
extensive cystic bronchiectasis of the left lower lobe. B. pseudomallei was 
cultured from sputum (MSHR1043) and nose and throat swabs at that 
time (57). P314's symptoms initially improved after 14 days of intrave- 
nous ceftazidime, but her sputum remained B. pseudomallei positive. 
P314 was subsequently given intravenous meropenem for a week, fol- 
lowed by 3 weeks of intravenous ceftazidime. Because of a severe 
trimethoprim-sulfamethoxazole allergy, P314 was administered doxycy- 
cline for the oral "eradication phase" of therapy, which is standard for 
melioidosis treatment completion (1). 

P314 has subsequently had numerous courses of intravenous ceftazidime 
and meropenem, trials of nebulized ceftazidime, and prolonged 
courses of oral doxycycline and amoxicillin-clavulanate to treat periods of 
increased cough and sputum production. In hindsight, these exacerba- 
tions and even the initial melioidosis presentation may not have reflected 
disease due to B. pseudomallei infection but rather standard bacterial in- 
fective exacerbations of bronchiectasis. In August 2003, 37 months after 
MSHR1043 was isolated, P314 underwent surgical lobectomy of the lower 
left lung lobe. B. pseudomallei was cultured from this lung tissue 
(MSHR1655). Despite excellent recovery, P314 continued to have respiratory 
exacerbations that were persistently culture positive for B. pseudomallei. 
In recent years, P314's respiratory symptoms have improved, although a 
productive cough with yellow or green sputum persists. 
MSHR6686 was isolated in January 2012 (139 months [11.5 years] after 
MSHR1043) during a routine visit for bronchiectasis management. P314 
has since not been administered antibiotics with activity against 
B. pseudomallei and has reported being the least symptomatic since her initial 
melioidosis diagnosis. P314's sputum remained B. pseudomallei positive 
as of January 2013. 

TABLE 3 Frameshift indels in P314 isolates MSHR1043, MSHR1655, and MSHR6686 not found in closed or draft genome-sequenced 
B. pseudomallei genomes 

Protein                                 Gene (K96243)   Contig    Start     End 
Type I polyketide synthase WcbR         BPSL2789        seq0007   325993    333551 
Bbp50 integrase                         NA^a            seq0028   553262    553702 
Integrase                               NA              seq0028   987262    988081 
Aromatic amino acid aminotransferase    BPSS2200        seq0028   1122114   1123340 

^a NA, not applicable. 

MSHR1043 and MSHR6686 were isolated from P314's sputum sam- 
ples by direct plating on chocolate agar. Cultures were screened against 
the TTS1 assay (58) to confirm their identity as B. pseudomallei prior to 
WGS. 

Antimicrobial testing. Etests (bioMérieux, Baulkham Hills, NSW, 
Australia) were used to determine the MICs of ceftazidime, gentamicin, 
and amoxicillin-clavulanate for P314's isolates. Etesting was carried out 
according to the manufacturer's instructions. 

WGS and hybrid assembly. High-quality and high-yield genomic 
DNA was obtained from B. pseudomallei cultures of MSHR1043 and 
MSHR6686 as previously described (59). The strains underwent paired-end 
WGS (500-bp fragments with ~800× and ~100× coverage, respectively) 
using the Illumina Genome Analyzer IIx platform (Illumina Inc., 
San Diego, CA). MSHR1043 was also sequenced using the 454 Genome 
Sequencer FLX instrument (454 Life Sciences, Branford, CT). Mira 3.4.0 
(60) was used to perform de novo hybrid 454-Illumina assembly of 
MSHR1043. MSHR1043 Illumina reads were reduced to ~80× coverage 
by quality filtering with FASTX-Toolkit v0.0.13 
(http://hannonlab.cshl.edu/fastx_toolkit/) prior to MIRA assembly. 454 and Illumina reads were 
quality filtered by MIRA as part of the hybrid assembly process. 

The raw hybrid MSHR1043 assembly was subjected to manual contig 
assembly to improve draft quality. Gap5 v1.2.11 (61) was used to check 
and, where possible, stitch MSHR1043 MIRA-generated contigs. A sec- 
ond hybrid assembly was performed using the MIRA-assembled contigs 
as a scaffold for unfiltered Illumina reads in Velvet v1.1.04 (62). Regions of 
synteny between MSHR1043 and publicly available B. pseudomallei ge- 
nomes were then identified using progressiveMAUVE (v2.3.1) alignment 
(63), BLAST (64), and the SynMap module of CoGe 
(http://genomevolution.org/CoGe/SynMap.pl). These tools enabled stitching 
of additional contigs in nonrepetitive, conserved loci. B. pseudomallei 
MSHR1655, which was isolated from P314 37 months after MSHR1043, 
was used as a reference. All contig stitches were rigorously validated by 
aligning unfiltered Illumina reads with the MSHR1043 reference using 
Burrows-Wheeler Aligner (BWA) v0.5.9 (65) with manual verification of 
correct read mapping and pairing over the putative contig joins. Variant 
calling was performed using the Genome Analysis Toolkit (GATK) 
v2.1-11 (66). All SNPs and indels were corrected, and incorrectly assembled 
contigs were manually separated. This process was iterated until no 
SNPs, indels, or misassemblies were identified in MSHR1043 with BWA 
and the GATK. To verify the length of MSHR1043, filtered Illumina reads 
were realigned with the reference with subsequent assembly of unaligned 
reads using MIRA. Lastly, contigs <800 bp in size were discarded. 

SNP, indel, and deletion detection and characterization. BWA was 
used to align Illumina reads for MSHR6686 against the MSHR1043 refer- 
ence genome on the basis of default parameters for paired-end Illumina 
data. The GATK and SAMtools v0.1.18 (67) were subsequently used to call 
SNPs and small indels (si 5 bp) following the removal of duplicate reads 
with Picard MarkDuplicates v1.6 (http://picard.sourceforge.net) and 
Smith-Waterman realignment of the bam file around regions with a high 
mismatch rate according to the GATK. SNPs and indels were restricted to 
haploid calls (using the ploidy filter) and filtered for quality using the 
GATK with the following parameters for SNPs and indels: clusterSize, 3; 
clusterWindowSize, 10; MLEAF, <0.95; QD, <10.0; MQ, <30; FS, >20; 
QUAL, <30; DP, <(average genome coverage)/4 or >(average genome 
coverage) × 3. Larger deletions were detected in MSHR6686 by using the 
coverageBed utility of BEDTools v2.15.0 (68) on the basis of a 1-kb win- 
dow size. To detect the acquisition of exogenous DNA in MSHR6686, 
Illumina reads for MSHR6686 were aligned against MSHR1043 and assembly 
was performed with unaligned reads using MIRA. All mutations 
were visually verified on the basis of Illumina read mapping in Tablet 
v1.12.08.29 (69). Two indels in seq0003 388954 (affecting oacA) and 
seq0007 194989 (affecting wbiI) have been previously characterized by 
dideoxy chain termination sequencing (26) and were confidently identi- 
fied by our pipeline. Annotation of variants was performed by using 
SnpEff v3.1 (70) with manual BLAST verification against the NCBI Mi- 
crobes genome database. Functional characterization of SNPs was carried 
out using PROVEAN v1.1 (71). 
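
The filtering and deletion-detection steps described above can be illustrated with a minimal Python sketch. It is not the pipeline used in this study (which relied on the GATK and BEDTools); the record layout, the function names, and the `min_mean_depth` cutoff are hypothetical, the clusterSize/clusterWindowSize filter is omitted, and the depth bounds are read as "below one-quarter of, or above three times, the average genome coverage."

```python
from statistics import mean


def fails_hard_filters(variant, avg_cov):
    """Return True if a variant call trips any threshold quoted in the text.

    `variant` is a hypothetical dict keyed by annotation name
    (MLEAF, QD, MQ, FS, QUAL, DP); `avg_cov` is the average genome coverage.
    """
    return (
        variant["MLEAF"] < 0.95
        or variant["QD"] < 10.0
        or variant["MQ"] < 30
        or variant["FS"] > 20
        or variant["QUAL"] < 30
        or variant["DP"] < avg_cov / 4
        or variant["DP"] > avg_cov * 3  # upper depth bound: our reading of the text
    )


def low_coverage_windows(depths, window=1000, min_mean_depth=1.0):
    """Return merged (start, end) half-open intervals of low-coverage windows.

    `depths` is a per-base read depth list for one contig. The 1-kb window
    follows the text; `min_mean_depth` is an assumed, illustrative cutoff.
    """
    runs = []
    for start in range(0, len(depths), window):
        chunk = depths[start:start + window]
        end = start + len(chunk)
        if mean(chunk) < min_mean_depth:
            if runs and runs[-1][1] == start:
                # Extend the previous run when windows are adjacent.
                runs[-1] = (runs[-1][0], end)
            else:
                runs.append((start, end))
    return runs
```

Candidate intervals returned by such a scan would still require visual confirmation against the read alignments, as was done for all mutations reported here.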

Nucleotide sequence accession number. This Whole Genome Shot- 
gun project has been deposited at DDBJ/EMBL/GenBank under accession 
no. AOGU00000000. The version described in this paper is the first version, 
AOGU01000000. 

SUPPLEMENTAL MATERIAL 

Supplemental material for this article may be found at 
http://mbio.asm.org/lookup/suppl/doi:10.1128/mBio.00388-13/-/DCSupplemental. 

Table S1, DOCX file, 0.1 MB. 

Figure S1, PDF file, 0.5 MB. 